A Bayesian Network Model for Interesting Itemsets
نویسندگان
چکیده
Mining itemsets that are the most interesting under a statistical model of the underlying data is a commonly used and well-studied technique for exploratory data analysis, with the most recent interestingness models exhibiting state of the art performance. Continuing this highly promising line of work, we propose the first, to the best of our knowledge, generative model over itemsets, in the form of a Bayesian network, and an associated novel measure of interestingness. Our model is able to efficiently infer interesting itemsets directly from the transaction database using structural EM, in which the E-step employs the greedy approximation to weighted set cover. Our approach is theoretically simple, straightforward to implement, trivially parallelizable and retrieves itemsets whose quality is comparable to, if not better than, existing state of the art algorithms as we demonstrate on several real-world datasets.
منابع مشابه
Classification as Mining and Use of Labeled Itemsets
We investigate the relationship between association and classification mining. The main issue in association mining is the discovery of interesting patterns of the data, so called itemsets. We introduce the notion of labeled itemsets and derive the surprising result that classification techniques such as decision trees, Naïve Bayes, Bayesian networks and Nearest Neighbor classifiers can all be ...
متن کاملRisk Analysis of Operating Room Using the Fuzzy Bayesian Network Model
To enhance Patient’s safety, we need effective methods for risk management. This work aims to propose an integrated approach to risk management for a hospital system. To improve patient’s safety, we should develop flexible methods where different aspects of risk and type of information are taken into consideration. This paper proposes a fuzzy Bayesian network to model and analyze risk in the op...
متن کاملAn efficient Bayesian network approach for discovering interesting patterns
The main problem faced by all association rule/pattern mining algorithms is their production of a large number of rules which incurred a secondary mining problem; namely, mining interesting association rules/patterns. The problem is compounded by the fact that ‘common knowledge’ discovered rules are not interesting, but they are usually strong rules with high support and confidence levels – the...
متن کاملA Probabilistic Model for COPD Diagnosis and Phenotyping Using Bayesian Networks
Introduction: This research was meant to provide a model for COPD diagnosis and to classify the cases into phenotypes; General COPD, Chronic bronchitis, Emphysema, and the Asthmatic COPD using a Bayesian Network (BN). Methods: The model was constructed through developing the Bayesian Network structure and instantiating the parameters for each of the variables. In order to validate the achiev...
متن کاملA Surface Water Evaporation Estimation Model Using Bayesian Belief Networks with an Application to the Persian Gulf
Evaporation phenomena is a effective climate component on water resources management and has special importance in agriculture. In this paper, Bayesian belief networks (BBNs) as a non-linear modeling technique provide an evaporation estimation method under uncertainty. As a case study, we estimated the surface water evaporation of the Persian Gulf and worked with a dataset of observations ...
متن کامل